One-Sample Speech Recognition of Mandarin Monosyllables using Unsupervised Learning

Authors

  • Tze Fen Li
  • Shui-Ching Chang
Abstract

In speech recognition, a Mandarin syllable waveform is compressed into a matrix of linear predictive coding cepstra (LPCC), so that each Mandarin syllable is represented by one LPCC matrix. We apply the Bayes decision rule to this matrix to identify the syllable. Suppose that there are K different Mandarin syllables, i.e., K classes. In pattern classification, the Bayes decision rule for separating K classes is known to give the minimum probability of misclassification. In this study, a set of unknown syllables is used to learn all unknown parameters (means and variances) for each class. At the same time, each class needs one known sample (syllable) to identify which of the learned means and variances belong to it among the K classes. Finally, the Bayes decision rule classifies both the set of unknown syllables and new unknown input syllables. The result is a one-sample speech recognition method. The classifier can adapt itself toward a better decision rule by making use of new unknown input syllables while the recognition system is in use. In the speech experiment using unsupervised learning to find the unknown parameters, the digit recognition rate improved by 22%.
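
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The following are assumptions made only for illustration: synthetic 2-D feature vectors stand in for LPCC matrices, each class is modeled as a Gaussian with diagonal covariance and equal priors, the parameters are learned with a simple hard-assignment iteration, and the one known sample per class is used as that class's initial mean so that component j stays tied to class j. All function and variable names are hypothetical.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def bayes_classify(x, means, variances):
    """Bayes decision rule with equal priors: choose the class whose
    Gaussian gives x the highest (log) likelihood."""
    scores = [log_gauss(x, m, v) for m, v in zip(means, variances)]
    return scores.index(max(scores))

def fit_unsupervised(data, known_samples, n_iter=20, var_floor=1e-3):
    """Estimate class means and variances from unlabeled data.

    Assumption: the single known sample of class j is the initial mean of
    component j, which is how each component is tied to a class here.
    """
    k, dim = len(known_samples), len(data[0])
    means = [list(s) for s in known_samples]
    variances = [[1.0] * dim for _ in range(k)]
    for _ in range(n_iter):
        # Assign every unlabeled vector to its most likely class.
        groups = [[] for _ in range(k)]
        for x in data:
            groups[bayes_classify(x, means, variances)].append(x)
        # Re-estimate each class's mean and variance from its group.
        for j, g in enumerate(groups):
            if not g:
                continue  # keep the previous parameters for an empty group
            means[j] = [sum(col) / len(g) for col in zip(*g)]
            variances[j] = [
                max(sum((xi - m) ** 2 for xi in col) / len(g), var_floor)
                for col, m in zip(zip(*g), means[j])
            ]
    return means, variances

# Synthetic demo: three well-separated classes (illustrative data only).
rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 6.0)]

def draw(center):
    return [rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)]

known = [draw(c) for c in centers]                        # one labeled sample per class
unlabeled = [draw(c) for c in centers for _ in range(100)]
rng.shuffle(unlabeled)

means, variances = fit_unsupervised(unlabeled, known)
predicted = [bayes_classify(list(c), means, variances) for c in centers]
print(predicted)
```

Because only the unlabeled set drives the parameter estimates, calling `fit_unsupervised` again with newly collected inputs appended to `unlabeled` mirrors, in a simplified way, the adaptation described in the abstract.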

Similar resources

On the Duration of Mandarin Tones

The present study compared the duration of Mandarin tones in three types of speech contexts: isolated monosyllables, formal text-reading passages, and casual conversations. A total of 156 adult speakers was recruited. The speech materials included 44 monosyllables recorded from each of 121 participants, 18 passages read by 2 participants, and 20 conversations conducted by 33 participants. The d...

Unsupervised Learning of Tone and Pitch Accent

Recognition of tone and intonation is essential for speech recognition and language understanding. However, most approaches to this recognition task have relied upon extensive collections of manually tagged data obtained at substantial time and financial cost. In this paper, we explore unsupervised clustering approaches to recognize pitch accent in English and tones in Mandarin Chinese. In unsu...

Unsupervised and Semi-supervised Learning of Tone and Pitch Accent

Recognition of tone and intonation is essential for speech recognition and language understanding. However, most approaches to this recognition task have relied upon extensive collections of manually tagged data obtained at substantial time and financial cost. In this paper, we explore two approaches to tone learning with substantial reductions in training data. We employ both unsupervised cl...

Mandarin Chinese Broadcast News Retrieval and Summarization Using Probabilistic Generative Models

This paper presents our recent research work on applying probabilistic generative models to Mandarin Chinese broadcast news retrieval and summarization. Most models can be trained in either a supervised or unsupervised manner. In addition, both literal term matching and concept matching strategies have been intensively investigated. This paper also presents a prototype web-based Mandarin Chines...

Unsupervised prosody labeling for constructing Mandarin TTS

This paper introduces an unsupervised prosody labeling method for preparing a large speech corpus used in developing a Mandarin Text-to-Speech system. Adopting a four-layer prosody hierarchy, the proposed method performs an unsupervised segmental clustering that iteratively segments spoken utterances into strings of prosodic constituents and models the patterns of the segmented prosodic constit...

Publication date: 2008